Abstract. In this paper we present a new fast algorithm for finding minimal reset words for
finite synchronizing automata. The problem is known to be computationally hard, and our
algorithm is exponential. Yet it is faster than the algorithms used so far and it works
well in practice. The main idea is to use a bidirectional BFS and radix (Patricia) tries
to store and compare the resulting subsets. We give both theoretical and practical arguments
showing that the branching factor is reduced efficiently. As a practical test we perform
an experimental study of the length of the shortest reset word for random automata with
n states and 2 input letters. We follow Skvorsov and Tipikin, who have performed such
a study using a SAT solver and considering automata with up to n = 100 states. With our
algorithm we are able to consider a much larger sample of automata with up to n = 300
states. In particular, we obtain a new, more precise estimate of the expected length of
the shortest reset word: $\approx 2.5\sqrt{n-5}$.

Keywords: synchronizing DFA, synchronizing word, Cerny conjecture, radix trie

1 Introduction

We deal with deterministic finite automata $A = \langle Q, \Sigma, \delta \rangle$ with the state set $Q$, the input
alphabet $\Sigma$, and the transition function $\delta : Q \times \Sigma \to Q$. The action of $\Sigma$ on $Q$ given by
$\delta$ is denoted simply by concatenation: $\delta(q, a) = qa$. This action extends naturally to the
action $qw$ of words $w \in \Sigma^*$. If $|Qw| = 1$, that is, the image of $Q$ by $w$ consists of a single
state, then $w$ is called a reset (or synchronizing) word for $A$, and $A$ itself is called synchronizing.
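
As a concrete illustration of these definitions, the following minimal Python sketch applies a word to a set of states and tests whether it is a reset word. The representation (a list `delta` with `delta[a][q]` the image of state `q` under letter `a`), the function names and the 3-state example are our own assumptions, not notation from the paper.

```python
def apply_word(delta, states, word):
    """Image of the set `states` under `word` (a sequence of letter indices)."""
    current = set(states)
    for a in word:
        current = {delta[a][q] for q in current}
    return current


def is_reset_word(delta, word):
    """True if `word` maps the whole state set onto a single state."""
    n = len(delta[0])
    return len(apply_word(delta, range(n), word)) == 1


# Hypothetical 3-state example: letter 0 is a cyclic shift,
# letter 1 sends state 2 to state 0 and fixes the other states.
delta = [[1, 2, 0], [0, 1, 0]]
print(is_reset_word(delta, [1, 0, 0, 1]))  # True
print(is_reset_word(delta, [1, 0, 1]))     # False
```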

The Cerny conjecture states that every synchronizing automaton A with n states has a reset
word of length $\le (n-1)^2$. This conjecture was formulated by Cerny in 1964, and is considered
the most longstanding open problem in the combinatorial theory of finite automata. So far, the
conjecture has been proved only for a few special classes of automata and a general cubic upper
bound has been established (see Volkov [22] for general motivation and an excellent survey of the
results, and Trahtman [19] for a recently found new cubic bound). Using computers the conjecture
has been verified for small automata with 2 letters and $n \le 10$ states (and with $k \le 4$ letters
and $n \le 7$ states [H]; see also [1] for n = 9 states). It is known that, in general, the problem
is computationally hard, since it involves an NP-hard decision problem. Recently, it has been
shown that the problem of finding the length of the shortest reset word is $\mathrm{FP}^{\mathrm{NP}[\log]}$-complete,
and the related decision problem is both NP- and coNP-hard [12] (cf. also [2] and [9,10]).

On the other hand, there are several theoretical and experimental results showing that most
synchronizing automata have relatively short reset words and that slowly synchronizing ones (with
shortest reset words of quadratic length) are rather exceptional [1]. An old result of Higgins
[7] on products in transformation semigroups shows that a random automaton with an alphabet
of size larger than 2n has, with high probability, a reset word of length $\le 2n$. More recently,
it was proved that a random automaton with n states over an alphabet with $n^{0.5+\varepsilon}$ letters is,
with high probability, synchronizing and satisfies the Cerny conjecture [18].

In computing reset words, either exponential algorithms finding the shortest reset words [17,21,8]
or polynomial heuristics finding relatively short reset words [6,14,15,16,21] are widely used. The
standard approach is the breadth-first-search method, which starts from the set of all states of
the given automaton and forms images by applying letter transformations until a singleton is
reached. Another approach is the semigroup algorithm, which generates all transformations for
words of increasing length until a synchronizing one is found [21]. Based on these ideas,
computation packages have been created (TESTAS [20] and the recently developed COMPAS [3]).
In [15], Roman uses a genetic algorithm to find reset words of randomly generated automata and
thus obtains upper bounds on the length of the shortest reset word.
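
The standard breadth-first-search approach mentioned above can be sketched as follows. This is a minimal illustration under our own assumptions (the `delta[a][q]` representation and the function name are hypothetical), not the implementation of any of the cited packages; it may visit exponentially many subsets.

```python
from collections import deque


def shortest_reset_length_bfs(delta):
    """Length of a shortest reset word found by BFS over subsets, or None."""
    n = len(delta[0])
    start = frozenset(range(n))
    dist = {start: 0}              # subset -> length of a shortest word reaching it
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if len(s) == 1:            # a singleton image means the word is a reset word
            return dist[s]
        for row in delta:          # one image per input letter
            t = frozenset(row[q] for q in s)
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return None                    # no singleton reachable: not synchronizing


# Hypothetical 3-state example: a cyclic shift and a letter merging state 2 into 0.
print(shortest_reset_length_bfs([[1, 2, 0], [0, 1, 0]]))  # 4
```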

A new interesting approach for finding the exact length using a SAT solver has been applied
recently by Skvorsov and Tipikin [17]. The problem of determining whether an automaton has a
reset word of length at most l is reduced to the SAT problem, and a binary search over l is
performed. Using this approach, the following experimental study was done. For chosen numbers n
of states from the interval [1, 100], random automata with 2 input letters are generated and
checked for being synchronizing, and if so, the shortest reset word is computed. The results
directly contradict the conjecture made by Roman [15] that the mean length of the shortest reset
word for a random n-state synchronizing automaton is linear and almost equal to 0.486n. Skvorsov
and Tipikin argue that their experiment, based on a larger set of data, shows that this length is
actually sublinear and $\approx 1.95 n^{0.55}$. They randomly generated and checked 2000 automata for each
$n \in \{1, 2, \ldots, 20, 25, 30, \ldots, 50\}$, 500 automata for each $n \in \{55, 60, 65, 70\}$, and 200 automata
for each $n \in \{75, 80, \ldots, 100\}$.

In this paper we present a new algorithm based on a bidirectional breadth-first search.
Implementing this idea requires an efficient solution to the problem of storing and comparing
the resulting subsets of states. For this purpose radix tries (also known as Patricia tries [11])
are used. We analyse the algorithm from both the theoretical and the practical side. As the first
test of efficiency we have performed experiments analogous to those done by Skvorsov and Tipikin.
We were able to generate and check one million automata for each $n \le 100$, 10000 automata for
each further $n \le 220$, and 1000 automata for each n in the set {225, 230, ..., 300}. Our data
confirm the hypothesis that the expected length of the shortest reset word is sublinear, but show
that a smaller approximation, $\approx 2.5\sqrt{n-5}$, is more precise. In addition, the larger set of data
enables us to estimate the error and to show that for our approximation, with high probability,
the error is very small. We also verify and discuss other results and claims of [17].

Our algorithm is suitable for finding the shortest reset words themselves, not only their
lengths. Curiously, it works in polynomial time for the known series of slowly synchronizing
automata [1]. Since there are opinions that random automata with more than 2 input letters may
exhibit a different behavior, we continue our experiment admitting various alphabets of size
k > 2. The results will be presented in the forthcoming extended version of the paper.

2 Algorithm 

The algorithm gets an automaton $A = \langle Q, \Sigma, \delta \rangle$ with n states and k input letters. First, we
check whether A is synchronizing using the well-known (and efficient) algorithm [5]. To find the
shortest synchronizing word, the main idea is to perform a bidirectional breadth-first search. The
steps of the standard BFS and the inverse BFS (IBFS) are performed alternately. The BFS starts
from the set Q of all states and computes a list of possible images. The IBFS starts from all the
singletons and computes the list of possible preimages. The branching factor of both of them
is k. In each step, the BFS and the IBFS, computing images or preimages, respectively, create a
list of subsets of Q, called the BFS-list and the IBFS-list, respectively. We make use of the
following fact.

Fact 1. Let $S, T$ be subsets of $Q$. If $S \subseteq T$, and $w_S$ and $w_T$ are the shortest words over $\Sigma$ such
that $|S w_S| = 1$ and $|T w_T| = 1$, then $w_S$ is not longer than $w_T$.

Thus, after each step of BFS we can reduce the BFS-list to contain only minimal subsets. Each
step of BFS consists of the following sub-steps. For every subset S in the BFS-list we form the
images of S by all the letters, and for every such image T we check whether we have already
visited any subset of T. For this purpose we maintain the list of subsets visited by BFS, called
the BFS-visited list. If a subset of T is in the BFS-visited list, then T is discarded; otherwise
T is inserted into the BFS-visited list and into the new BFS-list created during this step
(corresponding to the present level of BFS). This is repeated until the former BFS-list is
exhausted. We note that the new resulting BFS-list is partially ordered in the sense that no
inserted subset contains any earlier inserted subset (otherwise it would have been discarded
when it was generated). This makes it possible to reduce the BFS-list efficiently by removing
subsets that contain other subsets in the list (in a way described later).

Each step of IBFS is performed in an analogous (dual) way and results in the IBFS-list
reduced to contain only maximal subsets. After each step of BFS or IBFS the current lists are
compared to check whether there is a subset in the IBFS-list containing a subset in the BFS-list.
If so, the algorithm terminates. The number of steps performed is the length of the shortest reset
word. If we also store the sequences of letters used to form the images, we may output the reset
word itself as well.
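
The procedure described in this section can be sketched in Python as follows. This is a simplified illustration under our own assumptions, not the authors' implementation: visited subsets are kept in plain lists instead of radix tries, the two searches strictly alternate, the lists are not reduced to minimal or maximal elements, and the names `delta` (with `delta[a][q]` the image of state `q` under letter `a`) and `shortest_reset_length` are hypothetical.

```python
def shortest_reset_length(delta):
    """Bidirectional search for the length of a shortest reset word (or None)."""
    n = len(delta[0])
    fwd = [frozenset(range(n))]                    # BFS-list: images of Q
    bwd = [frozenset([q]) for q in range(n)]       # IBFS-list: preimages of singletons
    fwd_seen, bwd_seen = list(fwd), list(bwd)      # visited lists (naive, no tries)
    steps = 0
    while fwd and bwd:
        # The searches meet when some image is contained in some preimage.
        if any(s <= t for s in fwd for t in bwd):
            return steps
        new = []
        if steps % 2 == 0:                         # BFS step
            for s in fwd:
                for row in delta:
                    t = frozenset(row[q] for q in s)
                    # Fact 1: skip t if a subset of t has already been visited.
                    if not any(v <= t for v in fwd_seen):
                        fwd_seen.append(t)
                        new.append(t)
            fwd = new
        else:                                      # IBFS step
            for t in bwd:
                for row in delta:
                    p = frozenset(q for q in range(n) if row[q] in t)
                    # Dual rule: skip p if a superset of p has already been visited.
                    if p and not any(p <= v for v in bwd_seen):
                        bwd_seen.append(p)
                        new.append(p)
            bwd = new
        steps += 1
    return None                                    # a list died out: not synchronizing


# Hypothetical 3-state example: a cyclic shift and a letter merging state 2 into 0.
print(shortest_reset_length([[1, 2, 0], [0, 1, 0]]))  # 4
```

Storing, for each subset, the word that produced it would allow the sketch to return the reset word itself rather than only its length.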



2.1 Radix tries 

In order to check the required conditions quickly, we need a data structure for storing
subsets of Q and for checking whether a given subset S contains any subset inserted into the
structure (subset checking). We use the well-known radix tries (Patricia tries) for this purpose
[11]. Here a radix trie is a binary tree of maximal depth n which stores subsets of a given n-set
Q in its leaves. Given a fixed order $q_1, \ldots, q_n$ of the elements of Q, each subset S of Q encodes
a path from the root to a leaf in the natural way: at the i-th step the path goes to the right
child if $q_i \in S$, and to the left child otherwise. A radix trie is compressed in the sense that it
stores a subset in the first node that uniquely determines the subset in the stored collection
(no other stored subset shares the same path as a prefix).

The insert operation is natural and can be performed in at most n steps, creating at most n
nodes. The subset checking operation is performed by a depth-first search that checks whether the
given subset S contains the subset stored in each visited leaf. The search does not need to branch
into the right child of a node if the checked subset S does not contain the corresponding state.
It performs nm operations in the worst case, where m is the number of subsets stored in the trie,
but in practice the performance is much better (we discuss this later). The superset checking
operation (for the IBFS) is done in the dual way.
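
A minimal Python sketch of the insert and subset checking operations is given below. It is an illustration under simplifying assumptions: the trie is an ordinary binary trie without the path compression of a true radix (Patricia) trie, states are numbered 0, ..., n-1 with n at least 1, and the class and method names are hypothetical.

```python
class SubsetTrie:
    """Uncompressed binary trie storing subsets of {0, ..., n-1}."""

    def __init__(self, n):
        self.n = n
        self.root = {}   # a node is a dict: bit (0 = left, 1 = right) -> child node

    def insert(self, subset):
        """Follow or create the path encoded by `subset`, one level per state."""
        node = self.root
        for q in range(self.n):
            node = node.setdefault(1 if q in subset else 0, {})

    def contains_subset_of(self, subset):
        """True if some stored set is contained in `subset` (depth-first search)."""
        stack = [(self.root, 0)]
        while stack:
            node, depth = stack.pop()
            if depth == self.n:
                return True                      # reached a leaf of a stored subset
            if 0 in node:                        # stored set omits state `depth`
                stack.append((node[0], depth + 1))
            # Branch right only if the checked subset contains the state.
            if 1 in node and depth in subset:
                stack.append((node[1], depth + 1))
        return False


trie = SubsetTrie(4)
trie.insert({0, 2})
trie.insert({1, 2, 3})
print(trie.contains_subset_of({0, 1, 2}))  # True, since {0, 2} is stored
print(trie.contains_subset_of({0, 1, 3}))  # False
```

The superset check needed for the IBFS would be the mirror image: the search may skip the left child whenever the checked subset contains the corresponding state.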

In our algorithm we use radix tries for three different tasks. First of all, we use them to
maintain the BFS- and IBFS-visited lists and to perform subset or superset checking, respectively.
Next, we use radix tries to reduce the current BFS- and IBFS-lists. These are standard lists, but
to reduce them we form an empty auxiliary radix trie and insert the elements of, say, the BFS-list
in backward order. Before inserting each element, a DFS check for subsets is performed. Due to the
fact (mentioned earlier) that the list is partially ordered (larger sets precede their subsets on
the list), this leads to a minimal list, that is, one in which no subset contains any other subset.
The IBFS-list is reduced in the dual way. The only substantial difference is that we discard the
auxiliary IBFS-trie after use, while the auxiliary BFS-trie is kept for the next task. Namely, to
check whether the current IBFS-list has an element containing a subset in the BFS-list, we iterate
over the elements of the former and perform the subset checking operation using the auxiliary BFS
radix trie (which at that moment contains the same elements as the current BFS-list).
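
The backward-order reduction can be illustrated with a short Python sketch. As a simplification we assume a plain list of already kept sets in place of the auxiliary radix trie; the function name is hypothetical, and the input is assumed to satisfy the ordering property stated above (no set contains an earlier one).

```python
def reduce_to_minimal(sets):
    """Keep only the minimal sets of a list in which supersets precede subsets."""
    kept = []
    for s in reversed(sets):                 # insert in backward order
        # Subset check before insertion: drop s if it contains a kept set.
        if not any(k <= s for k in kept):
            kept.append(s)
    kept.reverse()                           # restore the original relative order
    return kept


bfs_list = [frozenset({0, 1, 2}), frozenset({0, 1}), frozenset({2, 3}), frozenset({0})]
print(reduce_to_minimal(bfs_list))  # [frozenset({2, 3}), frozenset({0})]
```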

2.2 Visited subsets 

The tries of visited subsets are the largest structures in the algorithm, so we try to keep them 
minimal. When we insert a subset which is contained in an already stored subset, we can remove 
the larger one. If we reach a node with the larger set we can just replace it by the smaller one 
and the size of a trie does not grow. This is the case when the prefixes of comparable subsets 
are equal. We do not look for and do not remove other larger sets during the insert operation,
since we have found this to be too time-consuming. Yet, we sometimes reduce the sizes of the BFS-
and IBFS-visited tries by rebuilding them in the backward order to obtain minimal collections 
of subsets. 

If these tries grow so large that we run out of memory, we switch to a hybrid DFS and
BFS method. This procedure starts from the list of subsets obtained by the last BFS step. If
the list is short, it performs one BFS step as in the previous part, but visited subsets are no
longer remembered or checked. If the list is long, it is split into a few smaller lists, and the
procedure is called recursively for each of them, starting from the one containing the smallest
subsets. The maximal depth of the search is the currently smallest length of a synchronizing word
found. In practice, this stage is used only for very large automata.

2.3 Heuristics 

We also use some heuristics to optimize the algorithm. Before each step we sort the BFS-list in
ascending order of set sizes and the IBFS-list in descending order. This helps in skipping
redundant subsets, since it implies that the smaller subsets are checked before the larger ones
and the resulting list (before minimization) is smaller. Also, thanks to this, we insert fewer
subsets into the tries containing visited subsets.

To decide which step should be performed, the BFS or the IBFS, we compare the sizes of
both current lists and perform the step for the smaller one. We have found it faster to give a
slight preference to the BFS: the size of the IBFS-list is weighted by a factor of $k = |\Sigma|$.
The BFS step is slightly faster since, in random automata, the overall size of the subsets after
x steps of the BFS is smaller than the overall size of the complements (n minus the subset size)
after x steps of the IBFS.

Random automata are usually not strongly connected, but contain a single strongly connected
component, and have the property that, when performing the BFS, the states outside the
component quickly become unreachable. So at the beginning of the algorithm we perform a few
steps of the standard BFS and construct a reduced automaton without these unreachable states.
Then the algorithm is called for the reduced automaton. The reduction usually gets rid of about
20% of the states for random automata, and after a few steps of BFS a strongly connected
automaton is usually obtained (we return to this issue in the next section).
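
The reduction can be illustrated by the following Python sketch. It is only an approximation of the step described above: instead of a fixed small number of BFS steps it iterates until the set of states reachable by words of a given length stops shrinking, and the function name and the `delta[a][q]` representation are our own assumptions.

```python
def drop_unreachable_states(delta):
    """Restrict the automaton to the states reachable by arbitrarily long words."""
    n = len(delta[0])
    reach = set(range(n))
    while True:
        # States reachable from `reach` by exactly one more letter.
        nxt = {row[q] for row in delta for q in reach}
        if nxt == reach:
            break
        reach = nxt
    order = sorted(reach)                         # surviving states, old numbering
    index = {q: i for i, q in enumerate(order)}   # old number -> new number
    reduced = [[index[row[q]] for q in order] for row in delta]
    return reduced, order


# Hypothetical 4-state example: states 0 and 3 become unreachable after two steps.
reduced, order = drop_unreachable_states([[1, 2, 1, 1], [2, 1, 2, 0]])
print(order)    # [1, 2]
print(reduced)  # [[1, 0], [0, 1]]
```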

2.4 Expected time of subsets checking 

The most time-consuming operation in our algorithm is subset checking in radix tries. We have
tried to estimate theoretically the expected time of this operation. Suppose that T is a radix
trie containing m random subsets of an n-element universe Q. We assume that a random subset S
of Q contains each element of Q with probability p (this is referred to as the Bernoulli model).
Using this model we have proved the following.

Theorem 1 For a random subset S in the Bernoulli model the expected searching time for a
subset of S in the trie T is $O(m^{\log_2(1+p)}/p)$.

Observe that this result shows an important improvement over the trivial O(m) comparisons.
For example, in the uniform model (p = 0.5), we have $O(m^{\log_2 3/2}) \approx O(m^{0.585})$. We also note
that the expected height of a trie with m subsets inserted is known to be $O(\log_2 m)$ (see [l]).
The expected time of the insert operation is much smaller, since it depends linearly on the
height of the trie.
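
To make the gain concrete, the bound $m^{\log_2(1+p)}/p$ can be evaluated for particular values of p. The following is only a worked arithmetic check: the trie size $m = 10^6$ is an arbitrary illustrative choice, and the constant hidden in the O-notation is ignored.

\[
p = \tfrac{1}{2}: \quad \log_2(1+p) = \log_2 3 - 1 \approx 0.585, \qquad \frac{m^{0.585}}{p} = 2\,m^{0.585} \approx 6.5 \times 10^{3} \quad \text{for } m = 10^{6},
\]
\[
p = \tfrac{1}{4}: \quad \log_2(1+p) = \log_2 5 - 2 \approx 0.322, \qquad \frac{m^{0.322}}{p} = 4\,m^{0.322} \approx 3.4 \times 10^{2} \quad \text{for } m = 10^{6}.
\]

Both values are far below the $10^6$ comparisons of a linear scan, and the exponent decreases with p.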

In the practical execution of the algorithm the probabilities of subset occurrences are, however,
quite different from those in the Bernoulli model. The following are a few significant differences
(established experimentally):

1. When a subset of S is found in the trie of visited subsets, the search is terminated, and this
situation occurs more frequently when executing the algorithm than in the Bernoulli model.

2. The numbers of subsets stored in the tries are smaller than the numbers of subsets inserted
into them. This is because of the procedure of replacing subsets when an inclusion is detected.

3. The probability distribution in the real performance is not constant. The sizes of subsets
reached by the BFS decrease rapidly in the first few steps. This can be compared with
decreasing values of p in the Bernoulli model.

These differences are in favor of the real performance and make the algorithm work faster.



2.5 Expected running time 

Bounding the expected running time of the algorithm is difficult due to many optimizations. We 
can however compute a bound for the bidirectional BFS with radix tries and compare it with 
the standard BFS. We use an assumption that the Bernoulli model considered earlier with small 
p is not better than the real distribution model. 

Then we are able to get the estimation of the expected running time of the form 0{ln^k^''^'^). 
The experiments performed showed that the real expected running time is much smaller. The 
reductions of lists to incomparable sets and subset checking work well, and the effective branching 
factor is considerably less than k. 

We have also checked the performance of the algorithm for known slowly synchronizing
automata: the Cerny automata and the other series considered in [1]. Surprisingly, all of these
are handled quickly by the IBFS. This is due to the checking for visited subsets and the fact that
the IBFS-list contains only one subset during all the steps except the first. The number of stored
visited subsets after l steps is thus $O(l + n)$, and a subset check does not exceed $O(n(n + l))$
operations in the worst case. Thus the running time for them is polynomial and can be roughly
estimated as $O(l \cdot n(n + l)) \approx O(n^5)$.
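
For reference, the Cerny automata mentioned above can be generated and checked with a short Python sketch. The state numbering, the function names and the `delta[a][q]` representation are our own assumptions; the word $(b a^{n-1})^{n-2} b$ of length $(n-1)^2$ is the classical reset word for this series.

```python
def cerny_automaton(n):
    """Letter 0 (a) is a cyclic shift; letter 1 (b) sends state n-1 to 0, fixes the rest."""
    a = [(q + 1) % n for q in range(n)]
    b = [0 if q == n - 1 else q for q in range(n)]
    return [a, b]


def cerny_reset_word(n):
    """The word (b a^(n-1))^(n-2) b as a list of letter indices."""
    block = [1] + [0] * (n - 1)
    return block * (n - 2) + [1]


def image_of_all_states(delta, word):
    """Image of the full state set under `word`."""
    current = set(range(len(delta[0])))
    for letter in word:
        current = {delta[letter][q] for q in current}
    return current


for n in range(2, 8):
    word = cerny_reset_word(n)
    assert len(word) == (n - 1) ** 2
    assert len(image_of_all_states(cerny_automaton(n), word)) == 1
```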



3 Experiments 

We performed a series of the following experiments for various $n \le 300$. For a given n, we generate
a random automaton A with n states and 2 input letters, check whether A is synchronizing, and
if so, we find the minimal length of a reset word using the algorithm described in Section 2. On
the basis of the obtained results we estimate the expected length of the shortest reset word.
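
The synchronizability test used in each experiment can be sketched as follows. The sketch relies on the standard criterion that an automaton is synchronizing if and only if every pair of states can be mapped to a single state by some word; it is a plain illustration of that criterion, with hypothetical names and the `delta[a][q]` representation as assumptions, and not necessarily the exact procedure of [5].

```python
from collections import deque


def is_synchronizing(delta):
    """Check that every pair of states can be merged by some word."""
    n = len(delta[0])
    # inverse[a][r] lists the states that letter a maps to r.
    inverse = [[[] for _ in range(n)] for _ in delta]
    for a, row in enumerate(delta):
        for q, r in enumerate(row):
            inverse[a][r].append(q)
    merged = {(q, q) for q in range(n)}      # pairs known to be mergeable
    queue = deque(merged)
    while queue:                             # backward search in the pair graph
        x, y = queue.popleft()
        for a in range(len(delta)):
            for p in inverse[a][x]:
                for r in inverse[a][y]:
                    pair = (min(p, r), max(p, r))
                    if pair not in merged:
                        merged.add(pair)
                        queue.append(pair)
    return len(merged) == n * (n + 1) // 2   # all unordered pairs, diagonal included


print(is_synchronizing([[1, 2, 0], [0, 1, 0]]))  # True
print(is_synchronizing([[1, 0], [1, 0]]))        # False
```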
3.1 Computations

In the experiments we have used the standard model of random automata, where for each state
and each letter all the possible transitions are equiprobable. A random automaton with n states
and 2 input letters can then be represented as a sequence of 2n uniformly random natural
numbers from [0, n - 1]. To generate high-quality random sequences we have used the WELL
number generator [13] (variants 1024 and 19937) seeded by random bytes from the /dev/random
device. We have computed exact results for automata with up to 7 states by checking all of them.
For each $8 \le n \le 100$ we checked one million automata, for each $101 \le n \le 220$ we checked 10000
automata, and for each n in {225, 230, ..., 300} we checked 1000 automata.
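
The random model can be sketched in a few lines of Python. Note the substitution: the sketch uses Python's built-in `random.Random` (a Mersenne Twister generator) instead of the WELL generators used in the experiments, and the function name and the `delta[a][q]` representation are our own assumptions.

```python
import random


def random_automaton(n, k=2, rng=None):
    """k rows of n independent, uniformly chosen target states from 0..n-1."""
    if rng is None:
        rng = random.Random()    # seeded from the operating system by default
    return [[rng.randrange(n) for _ in range(n)] for _ in range(k)]


delta = random_automaton(10, rng=random.Random(12345))
print(len(delta), len(delta[0]))  # 2 10
```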

The computations have been performed mostly on 16 computers with Intel(R) Core(TM)
i7-2600 CPU 3.40GHz 4 cores and ICB of RAM, with occasional support of up to 15 computers
with Intel(R) Core(TM) i3 CPU 540 3.07GHz 2 cores and 4GB of RAM. The algorithm was
implemented in C++ and compiled with g++. Distributed computations were managed by a
dedicated server and client applications written in Python.

3.2 Efficiency. 

The average computation time is about 100 to 1000 times shorter than the time of Trahtman's
program TESTAS [20,21] for automata with 50 states. The reduction to SAT used in [17] seemed
to be the fastest algorithm known so far; the reported average time is 2.7 seconds for automata
with 50 states and 70 seconds for automata with 100 states. Our comparable results are less than
0.006 and 0.07 seconds, respectively (we have used faster machines, but the resulting speedup
should be no more than a few times). Table 1 presents the comparison of the average computation
time and the maximum computation time for random automata from the experimental data set.

Table 1. The comparison of average computation time and the maximum time for random automata.

n                       50        100        150       200         250           300
TESTAS ([20])           1.4 s     time-out   -         -           -             -
SAT reduction ([17])    2.7 s     70 s       -         -           -             -
Our average time        0.005 s   0.065 s    2.65 s    54.4 s      12 min 18 s   1 h 2 min
Our maximum time        0.37 s    33.78 s    360 s     1 h 9 min   3 h 41 min    18 h 19 min


The average times are relatively small because hard-to-synchronize automata occur rarely.
So we also present the maximum computation time, which is much longer than the average since
the time depends strongly on the automaton. Our experiment did not find any really hard
example.

3.3 General results 

Our experiment confirms that for the standard random automata model A(n) on the binary
alphabet the probability that the automaton is synchronizing seems to tend to 1 as the number
n of states grows:

$$P(A(n) \text{ is synchronizing}) \xrightarrow[n \to \infty]{} 1.$$

This conjecture is posed in [17], but we have heard it earlier from Peter Cameron during the BCC
conference in Exeter in 2011. For n = 100, 2250 of one million automata turned out to be
non-synchronizing (0.225%), and for n = 300, only one of 1000 automata. Based on our experiments,
the curve of the synchronization probability in Figure 1 for automata with fewer than 100 states
is smooth and very likely to converge to 1.



[Figure 1 plot omitted: synchronization probability against the number of states n, for n from 20 to 300.]

Fig. 1. Experimental values of synchronization probability.



We also observe that random automata are mostly not strongly connected. Moreover, if an
automaton is synchronizing, then the expected size of the strongly connected sink component
seems to tend to the value $\approx 0.7987n$. We also noted that the average length of the minimal
synchronizing word in a random automaton is usually a little larger than the length in the
strongly connected automaton formed by its sink component.

3.4 The expected length of the shortest reset word 

The main result of the experiments is the estimation of the expected minimal length of a
synchronizing word. We consider the sequence of random variables $\ell(n)$ defined as the length of
the shortest reset word for a synchronizing automaton with n states. By $E[\ell(n)]$ we denote the
expected value of $\ell(n)$, and by $V[\ell(n)]$ its variance. Let $ML(n)$ denote the mean length of the
shortest reset word of the automata with n states generated in our experiment.

We have observed that the approximation $ML(n) \approx 1.95 n^{0.55}$ proposed in [17] is inflated.
We searched for an approximating function by filling some predefined templates with
different constants and comparing them by minimizing the sum of squared differences from the
experimentally computed estimates. Based on the currently available data, we propose a new, more
precise experimental approximation for the expected length:

$$E[\ell(n)] \approx 2.5\sqrt{n-5}. \qquad (1)$$

A comparison of both proposed functions with the experimental results is presented in
Figure 2. We observe that the expected length seems to belong to $O(\sqrt{n})$.
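
The two approximating functions are easy to compare numerically. The short Python sketch below only evaluates the two formulas quoted in the text; the function names are hypothetical, and no experimental data are reproduced.

```python
import math


def proposed_estimate(n):
    """The approximation 2.5 * sqrt(n - 5) proposed in this paper."""
    return 2.5 * math.sqrt(n - 5)


def earlier_estimate(n):
    """The approximation 1.95 * n^0.55 proposed by Skvorsov and Tipikin."""
    return 1.95 * n ** 0.55


print(round(proposed_estimate(100), 2))  # 24.37
print(round(earlier_estimate(100), 2))   # 24.55
```

The two functions are close around n = 100 and drift apart for larger n, where the earlier approximation gives the larger value.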
[Figure 2 plot omitted: mean length of the shortest synchronizing word against the number of states n, for n from 20 to 300. Series: experimental mean length; proposed estimation 2.5*sqrt(n-5); Skvortsov, Tipikin estimation 1.95*n**0.55; max found; min found.]

Fig. 2. The approximations proposed for the expected length of the shortest reset word.



3.5 Error estimation 

In contrast with the experiments by Skvorsov and Tipikin [17], our experiments allow us to
obtain a good estimate of the approximation error. We make use of the well-known Hoeffding's
inequality.

Yet, since the distribution of $\ell(n)$ is highly asymmetric, one needs to combine this inequality
with the statistical fact that the maximal lengths of the shortest reset words are much smaller
than the known bounds and that longer lengths occur rarely.

Suitable calculations lead to the following: 

Theorem 2 If less than $1/k$ of the n-state synchronizing automata have the length of the shortest
reset word larger than $M_n$, then with probability $1 - p$

$$|ML(n) - E[\ell(n)]| \le \frac{M_n (k-1)}{k} \sqrt{\frac{\log(2/p)}{2m}} + \frac{n^3}{6k}.$$

Assuming the Cerny conjecture, the term $n^3/6$ in the last summand may be replaced by $(n-1)^2$
(giving an essentially better estimate).

For n = 100, $m = 10^6$ and p = 0.0001 one may show that k > 100975 is as required, and
consequently, the error is less than 1.75 (or 0.19 assuming the Cerny conjecture). This means that
with high probability the expected length of the shortest reset word for synchronizing automata
with n = 100 states is close to our experimental result 24.34. Comparing this with the results of
Skvorsov and Tipikin [17], we note that, for automata with 100 states, they also obtained
an expected length close to 24, but taking into account the size of their sample, m = 200, no
reasonable estimate of the error can be obtained in this way (even values of p as large as
p = 0.1 lead to an error of a few hundred percent).
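
Part of this arithmetic can be reproduced with a short Python sketch. It rests on assumptions that should be read as such: the bound is taken to have the form $\frac{M_n(k-1)}{k}\sqrt{\log(2/p)/(2m)} + \frac{n^3}{6k}$ with the natural logarithm, the value k = 100975 is used as the boundary case, and the function names are hypothetical. The value of $M_n$ is not given in the text, so only the terms that do not depend on it are evaluated.

```python
import math


def hoeffding_factor(m, p):
    """sqrt(log(2/p) / (2m)): the sampling term per unit of the length cap M_n."""
    return math.sqrt(math.log(2.0 / p) / (2.0 * m))


def tail_term(n, k, assume_cerny=False):
    """Rare long words: n^3 / (6k), or (n-1)^2 / k under the Cerny conjecture."""
    worst = (n - 1) ** 2 if assume_cerny else n ** 3 / 6.0
    return worst / k


n, m, p, k = 100, 10**6, 0.0001, 100975
print(round(hoeffding_factor(m, p), 4))              # 0.0022
print(round(tail_term(n, k), 3))                     # 1.651
print(round(tail_term(n, k, assume_cerny=True), 3))  # 0.097
```

Under these assumptions the tail term accounts for most of the stated error of 1.75, and for roughly half of the 0.19 obtained when the Cerny conjecture is assumed.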

3.6 Distribution and variance. 

The results of our experiment allow us to compute an approximate probability distribution of
$\ell(n)$ for each tested n. Example distributions are shown in Figure 3. They are very similar for
larger n. For n = 7 states the exact distribution is presented.



[Figure 3 plot omitted: distribution of the length of the shortest reset word (horizontal axis: word length) for n = 7, 50, 100, 200 and 300.]

Fig. 3. Probability distributions.

We also confirmed the observation from [17] that the variance $V[\ell(n)]$ is a growing function.

We however do not confirm that the fraction "^r^. n, seems to tend to as n goes to infinity. The 
graph we have obtained (not reproduced here) does not exclude the possibility that the fraction 
converges to some positive constant. 



4 Conclusion 

Our algorithm for finding the minimal reset word is significantly faster than the algorithms used 
so far and works well for all the automata we have tested. Its time performance depends mainly 
on the length of the minimum word, but one has to observe that the known slowly synchronizing 
automata with the longest reset words are not the hardest ones for the algorithm. Thus, there is
hope of discovering new examples of classes of slowly synchronizing automata and of checking how
the situation changes in the case of automata with more than 2 input letters. The results of new
experiments are intended to be reported in the journal version of the paper. We note that there
are still possibilities to optimize the algorithm, in particular, by designing a faster data structure
for subset checking and for minimizing subset lists.